DGA: Direction-Guided Attack Against Optical Aerial Detection in Camera Shooting Direction-Agnostic Scenarios

Yue Zhou
Shuqi Sun
Xue Jiang
Guozheng Xu
Fengyuan Hu
Ze Zhang
Xingzhao Liu

Shanghai Jiao Tong University

TGRS 2024

[Paper]
[Code]
[Dataset]
[BibTeX]



Abstract

Patch-based adversarial attacks have drawn increasing concern due to their application potential in military and civilian fields. In aerial imagery, many targets, such as vehicles and ships, exhibit inherent directionality, which has given rise to oriented object detection tasks; adversarial patches likewise have an intrinsic orientation, since they are not perfectly symmetric. Existing methods presuppose a static alignment between the adversarial patch's orientation and the camera's coordinate system, an assumption frequently violated in aerial images, so their effectiveness degrades in real-world scenarios. In this paper, we investigate the often-neglected role of patch orientation in adversarial attacks and its impact on camouflage effectiveness, particularly when the patch's orientation is not aligned with the target's. We propose a new Direction-Guided Attack (DGA) framework for deceiving real-world aerial detectors, which shows robust and adaptable attack performance in camera shooting direction-agnostic (CSDA) scenarios. The core idea of DGA is to use affine transformations to constrain the orientation of the patch relative to the target, together with three losses that reduce target detection confidence, keep the colors printable, and smooth the patch. We further introduce a direction-guided evaluation methodology to bridge the gap between patch performance in the digital domain and its actual real-world efficacy. Moreover, we establish a drone-based vehicle detection dataset (SJTU-4K) with target orientation labels to assess patch robustness under various shooting altitudes and views. Extensive proportionally scaled and 1:1 experiments in physical scenarios demonstrate the superiority and potential of the proposed framework for real-world attacks.
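To make the core idea concrete, below is a minimal PyTorch sketch of the three ingredients named above: a target-relative affine rotation of the patch, a printability term, and a smoothness term. All names here (rotate_patch, paste_on_target, the weights alpha and beta) are illustrative assumptions, not the released implementation.

```python
import torch
import torch.nn.functional as F

def rotate_patch(patch, angle):
    """Rotate a (3, H, W) patch by `angle` (radians) with an affine grid,
    so the patch's orientation stays fixed relative to the target's heading."""
    angle = torch.as_tensor(angle, dtype=patch.dtype)
    c, s = torch.cos(angle), torch.sin(angle)
    zero = torch.zeros_like(c)
    theta = torch.stack([torch.stack([c, -s, zero]),
                         torch.stack([s,  c, zero])]).unsqueeze(0)  # (1, 2, 3)
    grid = F.affine_grid(theta, patch.unsqueeze(0).shape, align_corners=False)
    return F.grid_sample(patch.unsqueeze(0), grid, align_corners=False)[0]

def smoothness_loss(patch):
    """Total-variation term: penalize abrupt color changes between neighbors."""
    dh = (patch[:, 1:, :] - patch[:, :-1, :]).abs().mean()
    dw = (patch[:, :, 1:] - patch[:, :, :-1]).abs().mean()
    return dh + dw

def printability_loss(patch, printable_colors):
    """Distance from every pixel to its nearest printable color
    (printable_colors: a (K, 3) table of printer-reproducible RGB values)."""
    pixels = patch.permute(1, 2, 0).reshape(-1, 1, 3)              # (HW, 1, 3)
    dists = ((pixels - printable_colors.unsqueeze(0)) ** 2).sum(-1)  # (HW, K)
    return dists.min(dim=1).values.mean()

# One optimization step; `detector`, `paste_on_target`, and the target's
# heading `angle` are assumed to come from the training pipeline:
#   adv = paste_on_target(image, rotate_patch(patch, angle), box)
#   loss = detector(adv).max() \
#        + alpha * printability_loss(patch, colors) \
#        + beta * smoothness_loss(patch)
#   loss.backward()  # then update the patch pixels
```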


SJTU-4K Dataset


To collect the required data, we designed a reasonable capture scheme. First, we chose 20 scenes on our campus as experimental sites, including streets and car parks where many kinds of cars are commonly seen. We then used a DJI Mini 3 drone as the capture tool. Flight heights range from 20 m to 120 m, with 9 heights in total, and the resolution of the raw images is 4000 × 2260 pixels. We also provide aerial images captured at different pitch angles and directions, which are closer to real scenes. Figure 1 shows several images captured by our drone at different pitch angles from heights of 20 m, 40 m, 80 m, and 110 m. Object sizes are quite diverse, and the number of objects in a single image is large, especially for images captured at great heights. The detailed data distribution of the SJTU-4K dataset is shown in Figure 2.

Because the raw resolution is too large for our experiments, the images are cropped to 1024 × 1024 when attacked. Ultimately, there are 10260 training images with 57172 car objects and 6345 testing images with 19969 car objects.

To support the proposed patch-based attack evaluation method, we labeled vehicles with rotated boxes. Unlike the common rotated-box representation, we use a 360-degree angle representation to retain the heading direction of each vehicle, as sketched below.
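For illustration, here is a minimal sketch of such a 360-degree rotated-box annotation; the field names and layout are hypothetical assumptions, not the released annotation format.

```python
import math

# One annotation in a 360-degree rotated-box format (illustrative fields):
# (cx, cy) box center in pixels, (w, h) side lengths, and a heading
# angle in [0, 360) so the vehicle's front direction is preserved.
ann = {"cx": 512.0, "cy": 300.0, "w": 48.0, "h": 22.0, "angle": 237.5}

def box_corners(cx, cy, w, h, angle_deg):
    """Four corners of the rotated box. Unlike the common [0, 90) or
    [0, 180) conventions, angles in [0, 360) distinguish a vehicle
    from the same vehicle pointing the opposite way."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    half = [( w / 2,  h / 2), (-w / 2,  h / 2),
            (-w / 2, -h / 2), ( w / 2, -h / 2)]
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in half]
```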


Experiments on DGA